    Association Analysis Using Set-Based Approaches in the Post-GWAS Era

    Genotyping arrays have greatly facilitated genetic epidemiological studies into genetic risk factors for numerous complex diseases such as psychiatric disorders. The use of genome-wide association analysis (GWAS) is unequivocally established. More recently, DNA methylation arrays have enabled genome-wide profiling of the methylome, in addition to contemporary genetic epidemiology study design. An example of one such study is the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN) Lipidomics Study, which identified methylation markers (CpG markers) and single nucleotide polymorphisms (SNPs), associated with the change in triglyceride levels after drug intervention. Genotyping and methylation arrays assay several hundred thousand markers; however, single-marker association analysis suffers greatly from the burden of multiple testing. Set-based (SNP or CpG set) association approaches offer great flexibility, thus allowing the joint testing of a set of variants. For instance, a polygenic risk score (PRS) is a set-based approach, which, in addition to the strongly associated SNPs identified by large-scale GWAS, recruits SNPs with moderate to weak effects. The genotype information of the SNP set in the PRS is taken from an independent sample (target sample) and is then weighted by individual SNP effects derived from a relevant GWAS performed on a separate sample (discovery sample) into a cumulative score for each individual in the target sample. The resulting score, based on a SNP set or the PRS, is then regressed on the target phenotype. Such a regression model is evaluated by the amount of variance explained (R2) by the PRS in the target phenotype. Another strategy of set-based association analysis is kernel machine regression (KMR): a semi-parametric regression approach, in which the effects of markers within a set (CpG set or SNP set) are modelled via a kernel function and thus evaluated by a single-component variance test. 
A kernel function computes the pairwise genomic similarity between individuals, that is, the inner product over the set of variants under analysis, which may comprise a gene or a biological pathway. For my first article, I performed a simulation study to evaluate the performance of the PRS when the discovery and target traits are correlated, considering target sample sizes of n=200, 500, and 1000. A PRS for correlated traits arises, for example, when a schizophrenia PRS is calculated for psychosocial endophenotypes such as the global assessment of functioning (GAF) score or the positive and negative syndrome scale (PANSS) score. To reflect this situation, I simulated four target traits with varying degrees of correlation (r2) with the discovery trait, i.e., r2= 1.00, 0.8, 0.6, and 0.4. The results demonstrated that the average R2 estimated by the PRS decreased roughly by the square of the correlation between the target and discovery traits. In addition, the range of estimated R2 was most inflated at the smallest target sample size, n=200. Thus, the simulation findings alert researchers conducting clinical studies with endophenotypes to two important factors: first, the sample size of the target trait and, second, the amount of genetic correlation shared between the target and discovery traits. In my second article, I implemented a KMR approach for set-based association testing of a CpG set. KMR has been successfully employed on SNP sets. In preparing the second article, I used real and simulated datasets (the latter based on a real dataset) from the GOLDN study, provided by the Genetic Analysis Workshop 20 (GAW20). GOLDN is a longitudinal study with individuals recruited from pedigrees. In my analysis, I used only independent individuals, which restricted the sample size in the real and simulated datasets to n<200. In the real dataset, CpG sets were devised using the evidence of association reported by the GOLDN study. 
For the simulated datasets, the true causal CpGs were provided by GAW20. Thus, I formulated candidate genomic regions of varying lengths while keeping the associated CpG(s) inside the region. The results replicated the evidence of association reported by GOLDN in the real data and, albeit nominally, in the simulated datasets. Moreover, in the simulated data, causal SNPs exerted their full effect on the phenotypes only when the causal CpG loci had no methylation (β-value=0). Thus, I also considered modelling an interaction term along with the main effects. The results yielded a significant association. As part of the discussion, simulation results on the performance of the linear kernel for a CpG set with original (β-values) and logit-transformed methylation values (M-values) indicated that the logit transformation results in a loss of power. There, I also considered an additive kernel that combines the genotype kernel and the methylation kernel and then tests for association with the phenotype. The initial simulations suggest that an additive kernel with a CpG set that simultaneously includes hypo-, semi-, and hypermethylated sites might not improve the model over one that includes only a SNP set. However, it appears fruitful to investigate further the situation in which only one type of methylation state is present in a CpG set.
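The PRS construction described in the abstract above (weight the target-sample genotypes by discovery-sample effect sizes, sum per individual, then regress the target phenotype on the score and report R2) can be sketched as follows. This is a minimal illustration, not the author's pipeline: the function names, the simulated genotypes, the allele frequency of 0.3, and the effect-size distribution are all assumptions made for the example.

```python
import numpy as np


def polygenic_risk_score(genotypes, weights):
    """Weighted sum of allele counts per individual (rows = individuals)."""
    g = np.asarray(genotypes, dtype=float)
    w = np.asarray(weights, dtype=float)
    return g @ w


def variance_explained(score, phenotype):
    """R2 of a simple linear regression of phenotype on score.

    With a single predictor and an intercept, R2 equals the squared
    Pearson correlation between predictor and outcome.
    """
    r = np.corrcoef(score, phenotype)[0, 1]
    return float(r ** 2)


# Hypothetical target sample: 500 individuals, 100 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
n_target, n_snps = 500, 100
genotypes = rng.binomial(2, 0.3, size=(n_target, n_snps))

# Hypothetical per-SNP effects standing in for discovery-GWAS estimates.
weights = rng.normal(0.0, 0.05, size=n_snps)

prs = polygenic_risk_score(genotypes, weights)

# Hypothetical target phenotype: the score plus unit-variance noise.
phenotype = prs + rng.normal(0.0, 1.0, size=n_target)
r2 = variance_explained(prs, phenotype)
```

Rerunning the sketch with a smaller `n_target` (for example 200) and different seeds gives a rough feel for how much the R2 estimate varies in small target samples.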

    ADRA2A and IRX1 are putative risk genes for Raynaud's phenomenon

    Raynaud's phenomenon (RP) is a common vasospastic disorder that causes severe pain and ulcers, but despite its high reported heritability, no causal genes have been robustly identified. We conducted a genome-wide association study including 5,147 RP cases and 439,294 controls, based on diagnoses from electronic health records, and identified three unreported genomic regions associated with the risk of RP (p < 5 × 10⁻⁸). We prioritized ADRA2A (rs7090046, odds ratio (OR) per allele: 1.26; 95%-CI: 1.20-1.31; p < 9.6 × 10⁻²⁷) and IRX1 (rs12653958, OR: 1.17; 95%-CI: 1.12-1.22; p < 4.8 × 10⁻¹³) as candidate causal genes through integration of gene expression in disease-relevant tissues. We further identified a likely causal detrimental effect of low fasting glucose levels on RP risk (rG = -0.21; p-value = 2.3 × 10⁻³), and systematically highlighted drug repurposing opportunities, like the antidepressant mirtazapine. Our results provide the first robust evidence for a strong genetic contribution to RP and highlight a so far underrated role of α2A-adrenoreceptor signalling, encoded at ADRA2A, as a possible mechanism for hypersensitivity to catecholamine-induced vasospasms.

    Relating drug response to epigenetic and genetic markers using a region-based kernel score test

    In GAW20, we investigated the association of specific genetic regions of interest (ROIs) with log-transformed triglyceride (TG) levels following lipid-lowering medication using epigenetic and genetic markers. The goal was to incorporate kernels for cytosine-phosphate-guanine (CpG) markers and compare the kernels to a purely parametric model. Post-treatment TG levels were investigated for post-methylation data at CpG sites and region-specific SNPs and adjusted for pre-treatment TG levels and age, in independent individuals only (real data: n = 150; simulated data, replicate 84: n = 111). In both data sets, our single-CpG-marker results using kernels and linear regression were in good agreement. In the real data, we investigated the introns of the CPT1A gene, previously reported as associated with TG levels, as separate ROIs, and were able to find hints of an association of cg17058475 and cg00574958 with post-treatment TG levels. In the simulated data, we investigated a total of 10 regions, in which the 5 causal and 5 non-causal markers lie, respectively, with increased methylation variances, yielding plausible results for the 3 window sizes. Overall, this indicates that kernels for CpG markers are feasible. An interaction regression model for the causal SNP with the nearest CpG marker identified an effect for the SNPs with the three greatest simulated heritabilities. The simulation model assumed the full SNP effect only for unmethylated regions, decreasing to zero in the case of full methylation. Thus, in the context of a clear candidate setting, interaction between epigenetic and genetic data may enhance information, albeit nominally, even with small sample sizes. Because kernels relieve the burden of multiple testing, developing them further to analyze data from multiple omics jointly is well warranted.
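A region-based kernel score statistic of the kind referred to in the abstract above can be sketched as follows. This is one common textbook form, assumed here for illustration and not confirmed as the authors' implementation: a linear kernel is built from the CpG values in a region, the phenotype is regressed on the covariates under the null model, and the statistic is the quadratic form of the residuals in the kernel. The function names, the centring of markers, and the simulated data are assumptions; computing a p-value (which requires the null distribution of the quadratic form) is deliberately left out.

```python
import numpy as np


def linear_kernel(markers):
    """Pairwise similarity K = Z Z^T from column-centred marker values."""
    z = np.asarray(markers, dtype=float)
    z = z - z.mean(axis=0)  # centring is an assumption of this sketch
    return z @ z.T


def kernel_score_statistic(phenotype, covariates, kernel):
    """Q = r^T K r, with r the residuals of the covariate-only null model."""
    y = np.asarray(phenotype, dtype=float)
    x = np.column_stack([np.ones(len(y)), covariates])  # add intercept
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ coef
    return float(resid @ kernel @ resid)


# Hypothetical region: 150 independent individuals, 8 CpG sites with
# methylation proportions between 0 and 1.
rng = np.random.default_rng(1)
n = 150
cpg = rng.uniform(0.05, 0.95, size=(n, 8))

# Hypothetical covariates standing in for pre-treatment TG and age.
covs = rng.normal(size=(n, 2))
y = 0.5 * covs[:, 0] + rng.normal(size=n)

K = linear_kernel(cpg)
q = kernel_score_statistic(y, covs, K)
```

Because the whole region contributes to a single statistic, one test is performed per region rather than one per CpG site, which is the sense in which the set-based approach eases the multiple-testing burden.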